 brief summary


Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

arXiv.org Artificial Intelligence

Recent advancements in Vision-Language Models (VLMs) have sparked interest in their use for autonomous driving, particularly in generating interpretable driving decisions through natural language. However, the assumption that VLMs inherently provide visually grounded, reliable, and interpretable explanations for driving remains largely unexamined. To address this gap, we introduce DriveBench, a benchmark dataset designed to evaluate VLM reliability across 17 settings (clean, corrupted, and text-only inputs), encompassing 19,200 frames, 20,498 question-answer pairs, three question types, four mainstream driving tasks, and a total of 12 popular VLMs. Our findings reveal that VLMs often generate plausible responses derived from general knowledge or textual cues rather than true visual grounding, especially under degraded or missing visual inputs. This behavior, concealed by dataset imbalances and insufficient evaluation metrics, poses significant risks in safety-critical scenarios like autonomous driving. We further observe that VLMs struggle with multi-modal reasoning and display heightened sensitivity to input corruptions, leading to inconsistencies in performance. To address these challenges, we propose refined evaluation metrics that prioritize robust visual grounding and multi-modal understanding. Additionally, we highlight the potential of leveraging VLMs' awareness of corruptions to enhance their reliability, offering a roadmap for developing more trustworthy and interpretable decision-making systems in real-world autonomous driving contexts. The benchmark toolkit is publicly accessible.


A Brief Summary of Explanatory Virtues

arXiv.org Artificial Intelligence

In this report, I provide a brief summary of the literature in philosophy, psychology and cognitive science about Explanatory Virtues, and link these concepts to eXplainable AI.


Deep learning-based artificial intelligence applications in prostate MRI: brief summary

#artificialintelligence

Prostate cancer (PCa) is the most common cancer type in males in the Western World. MRI has an established role in the diagnosis of PCa through guiding biopsies. Due to the multistep, complex nature of the MRI-guided PCa diagnosis pathway, diagnostic performance varies widely. Developing artificial intelligence (AI) models using machine learning, particularly deep learning, has an expanding role in radiology. Specifically, for prostate MRI, several AI approaches have been defined in the literature for prostate segmentation, lesion detection and classification with the aim of improving diagnostic performance and interobserver agreement.


A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning

arXiv.org Artificial Intelligence

This paper briefly reviews the connections between meta-learning and self-supervised learning. Meta-learning can be applied to improve model generalization capability and to construct general AI algorithms. Self-supervised learning utilizes self-supervision from original data and extracts higher-level generalizable features through unsupervised pre-training or optimization of contrastive loss objectives. In self-supervised learning, data augmentation techniques are widely applied and data labels are not required since pseudo labels can be estimated from trained models on similar tasks. Meta-learning aims to adapt trained deep models to solve diverse tasks and to develop general AI algorithms. We review the associations of meta-learning with both generative and contrastive self-supervised learning models. Unlabeled data from multiple sources can be jointly considered even when data sources are vastly different. We show that an integration of meta-learning and self-supervised learning models can best contribute to the improvement of model generalization capability. Self-supervised learning guided by meta-learner and general meta-learning algorithms under self-supervision are both examples of possible combinations.


Stanford's 2020 AIMI Symposium: A Brief Summary

#artificialintelligence

Session 1 was titled Democratizing Healthcare with AI. This was by far the most intriguing and interesting session, as it had lots of great insights on how to make high-level research available to all. Session 1 began by highlighting the gap between high-level research and the people who actually can benefit from the research. All speakers stressed the importance of developing products from the research that can truly help the field. Some advancements have been made, and some presented include smartphone-powered ultrasounds, smartwatch-based diagnosis of Afib, and even AI-powered dieting apps.


9 Key Machine Learning Algorithms Explained in Plain English

#artificialintelligence

Machine learning [https://gum.co/pGjwd] is changing the world. Google uses machine learning to suggest search results to users. Netflix uses it to recommend movies for you to watch. Facebook uses machine learning to suggest people you may know. Machine learning has never been more important. At the same time, understanding machine learning is hard. The field is full of jargon. And the number of different ML algorithms grows each year. This article will introduce you to the fundamental concepts


The Netherlands Strategic Action Plan for Artificial Intelligence (AI) -- A Brief Summary

#artificialintelligence

In October 2019, sitting amidst the audience at the World AI Summit conference held in Amsterdam, I watched with excitement as the Secretary of State (Ministry of Economic Affairs), Mrs Mona Keijzer, took to the stage to announce the recent launch of the AI Coalition in the Netherlands, and consequently the Strategic AI Action Plan that resulted from the work of its 65 parties. Mrs Keijzer seemed quite optimistic about the Netherlands' ability to compete at the global level. This optimism is indeed justified by key ingredients that make the country ready to embrace such a leap forward. For instance, the Netherlands is one of the most data-connected countries in the world, according to DHL research, and on top of this, the country has achieved a strong ecosystem of public-private partnerships (PPP), in addition to a leading European position in high-quality research. A McKinsey Report on AI in Europe ranked the Netherlands above average when it comes to AI readiness, with top 25% scores for Automation, Digital Readiness and Innovation.


A Brief Summary of Maths Behind RNN (Recurrent Neural Networks)

#artificialintelligence

In a feedforward neural network (FFNN), we have X (input), H (hidden), and y (output). We can have as many hidden layers as we want, but the weights (W) for every hidden layer are different, and the weights for every neuron corresponding to the input are different. Above, the weights Wh0 and Wh1 correspond to two different layers, while Wh00, Wh01, and so on represent the different weights of different neurons with respect to the input. The RNN cell contains a set of feedforward neural networks because we have time steps. The RNN has sequential input, sequential output, multiple time steps, and multiple hidden layers. Unlike an FFNN, here we calculate hidden-layer values not only from the input values but also from the previous time step's values, and the weights (W) at the hidden layers are the same across all time steps. Here is the complete picture for RNN and its math.
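
The weight sharing described above can be illustrated with a minimal sketch of a single recurrent layer in plain Python. The names `rnn_forward`, `W_xh`, `W_hh`, and `b_h` are hypothetical and do not come from the article, and the tanh activation is an assumption (a common choice, but the summary does not name one). The output layer is left out to keep the focus on the hidden-state update.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_forward(inputs, W_xh, W_hh, b_h, h0):
    """Run one recurrent layer over a sequence and return every hidden state.

    The same W_xh, W_hh, and b_h are reused at every time step; only the
    hidden state h changes as the sequence is consumed.
    """
    h = h0
    hidden_states = []
    for x_t in inputs:
        from_input = matvec(W_xh, x_t)   # contribution of the current input
        from_prev = matvec(W_hh, h)      # contribution of the previous hidden state
        # tanh is an assumed activation, not taken from the article
        h = [math.tanh(a + b + c) for a, b, c in zip(from_input, from_prev, b_h)]
        hidden_states.append(h)
    return hidden_states

# Illustrative values only: 1 input feature, 2 hidden units, 3 time steps.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]
inputs = [[1.0], [1.0], [1.0]]

states = rnn_forward(inputs, W_xh, W_hh, b_h, h0=[0.0, 0.0])
for t, h in enumerate(states):
    print(t, [round(v, 4) for v in h])
```

Although the input is identical at every step in this example, the hidden states differ from step to step because each one also depends on the state before it. Setting `W_hh` to all zeros removes that dependence, and the layer then behaves like a feedforward layer applied independently at each step.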